Web Information Acquisition with Lixto Suite: A Demonstration
Authors
Abstract
We demonstrate the Lixto Suite, a web data extraction and transformation software kit for retrieving and converting information from various sources to various customer devices. With the Lixto Suite, non-technical content managers can rapidly develop applications in the areas of M-Commerce, E-Commerce, content integration and corporate portals.
Similar resources
Lixto – Price Intelligence Suite
IMPACT The Lixto Price Intelligence Suite is a solution that extracts price-comparison information from competitor online web channels and combines it with internal data sources so organizations gain greater visibility into the factors that might influence price. The suite works by navigating and extracting competitive product and pricing information from predefined online data sources before s...
Visual Web Information Extraction with Lixto
We present new techniques for supervised wrapper generation and automated web information extraction, and a system called Lixto implementing these techniques. Our system can generate wrappers which translate relevant pieces of HTML pages into XML. Lixto, of which a working prototype has been implemented, assists the user to semi-automatically create wrapper programs by providing a fully visual ...
Applying ASP Inferential Engines to the Filtering, Decoration and Validation of Data from Web Sources
We propose a software architecture for semantics-based annotation of data extracted from Web sources. Data can be extracted from arbitrary Web sources thanks to two wrapping engines: the LiXto suite, which supports semi-automated data extraction and XML formatting, and Dynamo, which is based on the novel approach of HTML metatag decoration. The XML tags produced by the wrappers are then fed to a...
Declarative Web data extraction and annotation
We propose a software architecture for semantics-based annotation of data extracted from Web sources. Starting from the LiXto suite, which enables semi-automated extraction of XML data from regular documents, we present a solution for attaching background information to individual tags by means of so-called decorations. Decoration is carried out as an inferential activity in the formal context ...
Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto
Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting information from Web pages using such wrappers, and for translating the extracted content into XML. This paper describes some advanced features of Lixto, such as disjunctive pattern definitions, specialization rules, and Lixto’...